ABSTRACT

There are currently three main approaches to
parameter estimation in item response theory (IRT): (1) joint maximum
likelihood, exemplified by LOGIST, yielding maximum likelihood
estimates; (2) marginal maximum likelihood, exemplified by BILOG,
yielding maximum likelihood estimates of item parameters (ability
parameters can be estimated subsequently, using Bayesian procedures);
and (3) Bayesian approaches, in which parameter estimates are usually the mode
or mean of the posterior distribution of the parameter estimated.
Advantages and disadvantages of these three methods are discussed and
compared. (Author/BW)
Maximum Likelihood and Bayesian Parameter Estimation in Item Response Theory*

There are currently three main approaches to parameter estimation in 
item response theory (IRT):

1. Joint maximum likelihood, exemplified by LOGIST, yielding maximum 
likelihood estimates (Wingersky, 1983). 

2. Marginal maximum likelihood, exemplified by BILOG. This approach
currently yields maximum likelihood estimates of item parameters.
Ability parameters can be estimated subsequently, using Bayesian
procedures (Mislevy & Bock, 1981).

3. Bayesian approaches: parameter estimates are usually the mode (or 
mean) of the posterior distribution of the parameter estimated 
(Swaminathan & Gifford, in press). 

The quantity maximized by each approach is shown below for a test of n
items administered to N examinees. $P_i(\theta_a)$ is the probability of success on
item i for examinee a at ability level $\theta_a$, $Q_i(\theta_a) = 1 - P_i(\theta_a)$, $u_{ia}$
is the response of examinee a to item i, assumed here to be either 0 or 1,
and g( ) denotes a prior distribution of parameters.
Joint maximum likelihood: 

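Maximize

$$L(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c}) = \prod_{a=1}^{N}\prod_{i=1}^{n}[P_i(\theta_a)]^{u_{ia}}[Q_i(\theta_a)]^{1-u_{ia}} \tag{1}$$

or

$$\log L(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c}) = \sum_{a=1}^{N}\sum_{i=1}^{n}\left[u_{ia}\log P_i(\theta_a) + (1-u_{ia})\log Q_i(\theta_a)\right].$$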
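As a minimal sketch of the quantity in (1), the following Python code evaluates the joint log likelihood for a small data set. It assumes the three-parameter logistic form for the item response function with scaling constant D = 1.7; the item parameters, abilities, and responses are hypothetical values chosen only for illustration, and the function names are not taken from LOGIST.

```python
import math

D = 1.7  # scaling constant conventionally used with the logistic model

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def joint_log_likelihood(thetas, items, responses):
    """Sum over examinees and items of u*log P + (1 - u)*log Q."""
    total = 0.0
    for theta, row in zip(thetas, responses):
        for (a, b, c), u in zip(items, row):
            p = p3pl(theta, a, b, c)
            total += u * math.log(p) + (1 - u) * math.log(1.0 - p)
    return total

# Hypothetical item parameters (a, b, c), abilities, and 0/1 responses.
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
thetas = [0.5, -0.5]
responses = [[1, 1, 0], [1, 0, 0]]
print(joint_log_likelihood(thetas, items, responses))
```

Joint maximum likelihood would search over both the abilities and the item parameters to make this value as large as possible; the sketch only evaluates the criterion at fixed values.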



*This work was supported in part by contract N00014-83-C-0457,
project designation NR 150-520, between the Office of Naval Research and
Educational Testing Service. Reproduction in whole or in part is
permitted for any purpose of the United States Government.
Marginal maximum likelihood of item parameters:

Maximize

$$L(\mathbf{a},\mathbf{b},\mathbf{c}) = \prod_{a=1}^{N}\int_{-\infty}^{\infty} g(\theta_a)\,L(\theta_a;\mathbf{a},\mathbf{b},\mathbf{c})\,d\theta_a. \tag{2}$$

Bayesian modal estimation: 

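Maximize

$$f(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c}) = L(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c})\,g_1(\boldsymbol{\theta})\,g_2(\mathbf{a},\mathbf{b},\mathbf{c}) \tag{3}$$

or

$$\log f(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c}) = \log L(\boldsymbol{\theta};\mathbf{a},\mathbf{b},\mathbf{c}) + \log g_1(\boldsymbol{\theta}) + \log g_2(\mathbf{a},\mathbf{b},\mathbf{c}).$$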

LOGIST finds the ability and item parameter values that maximize the
likelihood function of the observations. Bayesian methods typically multiply
this likelihood by a prior for each of the parameters, obtaining the joint
posterior distribution of the parameters, which are usually assumed to be 
independently distributed. The Bayesian modal estimates (BME) of all the 
parameters are the values at the mode of this joint posterior distribution. 
Marginal maximum likelihood multiplies the original likelihood by a prior on 
ability, eliminates the ability parameters by integration, and obtains 
maximum likelihood estimates of the item parameters by maximizing the 
resulting 'marginal' likelihood function. Supplementary Bayesian procedures 
may be used to obtain ability parameter estimates. Bayesian priors on item 
parameters may also be used in MMLE. 
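The integration step in marginal maximum likelihood can be sketched numerically as follows. This is an illustration only: it assumes a three-parameter logistic model with D = 1.7, a standard normal prior g on ability, and a simple midpoint rule in place of whatever quadrature scheme a production program such as BILOG uses. The item parameters and responses are hypothetical.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def pattern_likelihood(theta, items, u):
    """Likelihood of one examinee's response pattern at a given theta."""
    like = 1.0
    for (a, b, c), u_i in zip(items, u):
        p = p3pl(theta, a, b, c)
        like *= p if u_i == 1 else 1.0 - p
    return like

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def marginal_log_likelihood(items, responses, lo=-6.0, hi=6.0, steps=240):
    """Log of the marginal likelihood, integrating ability out by a midpoint sum."""
    h = (hi - lo) / steps
    nodes = [lo + (k + 0.5) * h for k in range(steps)]
    total = 0.0
    for u in responses:
        integral = h * sum(
            std_normal_pdf(t) * pattern_likelihood(t, items, u) for t in nodes
        )
        total += math.log(integral)
    return total

# Hypothetical item parameters (a, b, c) and response patterns.
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
responses = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]
print(marginal_log_likelihood(items, responses))
```

Note that the function takes no ability values as input: only the item parameters remain to be estimated, which is the property the text attributes to the marginal approach.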
When approximately parallel test forms are administered year after year
to similar populations of examinees, it becomes possible to deduce appropriate
prior distributions for the item and the ability parameters from past results.
In such a situation, Bayesian procedures should certainly yield better
parameter estimates than maximum likelihood, since Bayesian procedures make
use of more information. Even in the absence of data from previous
administrations, Bock's BILOG is able to work with a reasonable prior
distribution of ability generated directly from the current data.

Marginal maximum likelihood has an important advantage over joint maximum
likelihood, since it can estimate item parameters without having to estimate
ability parameters. The advantage is not one of computational speed but
rather of theoretical accuracy. When there are one or two thousand examinees
and 40 or more items per person, there will be little difference between
the estimates. In cases where there are only 10 or 15 items per person,
joint maximum likelihood will obtain biased estimates of ability parameters,
especially at low ability levels. This then causes the item parameters to be
misestimated, even though the number of examinees per item is large.

Let us turn now to Bayesian procedures. From the mathematical
statistician's point of view, one clear virtue of Bayesian methods is that if
the posterior mean is used as a parameter estimate, this estimator minimizes
the overall mean squared error of estimation, provided the appropriate prior
distribution is used. In the case of ability parameters, for example, the
quantity minimized is
$$\mathrm{MSE} = \mathcal{E}(\hat\theta_a - \theta_a)^2, \tag{4}$$

where $\hat\theta_a$ is an estimate of $\theta_a$ and $\mathcal{E}$ denotes expectation over all examinees.
The posterior mean achieves this important result by accepting increased
estimation bias in return for reduced MSE. The BME does not minimize this MSE
unless the mode of the posterior distribution coincides with its mean, which is
not the case in IRT estimation problems. Nevertheless, the BME may be close to
the posterior mean.

Why does the posterior mean do better than the maximum likelihood estimate
(MLE) in minimizing MSE? When item parameters are known, the MLE of ability
assigned to a given response pattern must always be the same. In Bayesian 
methods, however, the ability estimate assigned to a given response pattern 
depends on the characteristics of the entire group analyzed. It is this 
additional flexibility that allows Bayesian methods to obtain a smaller MSE. 
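The contrast between the two estimators can be illustrated with a short sketch. It assumes a three-parameter logistic model with D = 1.7 and known item parameters, a normal prior whose mean and standard deviation stand in for "the characteristics of the entire group," and a simple grid over ability; all numerical values are hypothetical.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def log_likelihood(theta, items, u):
    total = 0.0
    for (a, b, c), u_i in zip(items, u):
        p = p3pl(theta, a, b, c)
        total += math.log(p) if u_i == 1 else math.log(1.0 - p)
    return total

GRID = [-4.0 + 0.01 * k for k in range(801)]  # theta values from -4 to 4

def mle(items, u):
    """Grid-search MLE: a function of the items and the response pattern only."""
    return max(GRID, key=lambda t: log_likelihood(t, items, u))

def posterior_mean(items, u, prior_mean, prior_sd):
    """Posterior mean of theta under a normal prior describing the group."""
    weights = [
        math.exp(log_likelihood(t, items, u)
                 - 0.5 * ((t - prior_mean) / prior_sd) ** 2)
        for t in GRID
    ]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

# Hypothetical five-item test and one response pattern.
items = [(1.0, b, 0.2) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]
u = [1, 1, 1, 0, 0]

theta_mle = mle(items, u)
low_group = posterior_mean(items, u, prior_mean=-1.0, prior_sd=1.0)
high_group = posterior_mean(items, u, prior_mean=1.0, prior_sd=1.0)
print(theta_mle, low_group, high_group)
```

The same response pattern always receives the same MLE, whereas the posterior mean moves with the prior: it is lower when the examinee is analyzed with a low-ability group and higher with a high-ability group.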

Figure 1 shows the bias, estimated by asymptotic formula (7) accurate
through terms of order 1/n, for the BME of $\theta$ (dashed curve) based on a
normal prior and for the MLE of $\theta$ (solid curve), calculated for an 90-item SAT
Verbal test that first came to hand, using the three-parameter logistic model.
The values shown in the figure assume the item parameters to be known.

The MLE and the BME are biased in opposite directions. The BME is more 
biased than the MLE. Note that neither bias is linearly related to $\theta$;
consequently, the bias cannot be corrected by a simple linear transformation 
of the estimates. 

Figure 1. Bias in estimated ability for an 90-item SAT Verbal test.

When a normal prior is used for $\theta$, the asymptotic standard errors of the
BME and of the MLE for estimated $\theta$ are identical to the usual order of
approximation (1/n). The (familiar) formula for both asymptotic standard
errors is



$$\mathrm{S.E.}(\hat\theta) = \left(\sum_{i=1}^{n}\frac{(P_i')^2}{P_iQ_i}\right)^{-1/2}, \tag{5}$$

the square root of the reciprocal of the test information function (I). This
is also the asymptotic formula for both MSE's. If the S.E. were written out
including higher order terms, the Bayesian S.E. would be smaller than the
maximum likelihood S.E. by an amount of order $1/n^2$.
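For readers who want to evaluate the standard error formula numerically, here is a minimal sketch. It assumes the three-parameter logistic model with D = 1.7, for which the slope of the item response function is D a (1 - c) psi (1 - psi), with psi the ordinary logistic function; the item parameters are hypothetical.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def dp3pl(theta, a, b, c):
    """Derivative of the item response function with respect to theta."""
    psi = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return D * a * (1.0 - c) * psi * (1.0 - psi)

def information(theta, items):
    """Test information: sum over items of P'^2 / (P Q)."""
    total = 0.0
    for a, b, c in items:
        p = p3pl(theta, a, b, c)
        total += dp3pl(theta, a, b, c) ** 2 / (p * (1.0 - p))
    return total

def asymptotic_se(theta, items):
    """Square root of the reciprocal of the test information."""
    return information(theta, items) ** -0.5

# Hypothetical item parameters (a, b, c).
items = [(1.0, b, 0.2) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, asymptotic_se(theta, items))
```

Because the information is a sum over items, doubling the test length with parallel items divides the asymptotic standard error by the square root of 2.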

The asymptotic bias in the MLE in the three-parameter logistic model is
(Lord, 1983)

$$\mathrm{Bias}(\mathrm{MLE}(\theta)) = \frac{D}{I^2}\sum_{i=1}^{n} a_i I_i\left(\phi_i - \tfrac{1}{2}\right), \tag{6}$$

where D = 1.7, $I_i = (P_i')^2/(P_iQ_i)$, $\phi_i = (P_i - c_i)/(1 - c_i)$, $I = \sum_{i=1}^{n} I_i$,
and $a_i$ and $c_i$ are the discrimination parameter and lower asymptote for
item i. The asymptotic bias for the BME is found by the same method to be

$$\mathrm{Bias}(\mathrm{BME}(\theta)) = \mathrm{Bias}(\mathrm{MLE}(\theta)) - \frac{\theta}{I}. \tag{7}$$

Both (6) and (7) are of order 1/n.
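As a rough numerical illustration of the two bias formulas, the sketch below evaluates them for the three-parameter logistic model. It assumes that (6) has the form (D / I^2) * sum of a_i I_i (phi_i - 1/2), with phi_i = (P_i - c_i)/(1 - c_i), and that (7) subtracts theta / I from it, which is what a standard normal prior gives to this order of approximation. The item parameters are hypothetical.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def item_information(theta, a, b, c):
    """I_i = P_i'^2 / (P_i Q_i) for one three-parameter logistic item."""
    p = p3pl(theta, a, b, c)
    slope = D * a * (p - c) * (1.0 - p) / (1.0 - c)  # dP/dtheta
    return slope * slope / (p * (1.0 - p))

def mle_bias(theta, items):
    """Asymptotic bias of the MLE of ability (assumed form of formula 6)."""
    info = sum(item_information(theta, a, b, c) for a, b, c in items)
    weighted = 0.0
    for a, b, c in items:
        phi = (p3pl(theta, a, b, c) - c) / (1.0 - c)
        weighted += a * item_information(theta, a, b, c) * (phi - 0.5)
    return D * weighted / info ** 2

def bme_bias(theta, items):
    """Asymptotic bias of the BME, assuming a standard normal prior (formula 7)."""
    info = sum(item_information(theta, a, b, c) for a, b, c in items)
    return mle_bias(theta, items) - theta / info

# Hypothetical item parameters (a, b, c).
items = [(1.0, b, 0.2) for b in (-1.5, -0.5, 0.0, 0.5, 1.5)]
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(theta, mle_bias(theta, items), bme_bias(theta, items))
```

Both quantities are asymptotic in the number of items, so values computed for a test as short as this one indicate direction and relative size only.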

Because of the bias in the BME, which is best described as regression
towards the mean, the variance of the BME across examinees is less than the
variance of the true $\theta$ values. Many people apply a linear transformation
to the BME's in an attempt to make the variance (across examinees) of the
resulting transformed estimates equal to the variance of the true $\theta$ values.
This procedure is only a rough approximation, since it is based on an
assumption of linear regression of BME on $\theta$, whereas the true regression
is curvilinear. From the mathematical statistician's point of view,
a linearly transformed BME or posterior mean, or a curvilinearly transformed
BME or posterior mean, is a nonstandard type of estimator. Such transformed
estimators no longer have the property of minimizing MSE.

A further problem arises: Minimizing MSE on the $\theta$ scale is not the
same as minimizing MSE on the true-score scale, or on some other transformed
ability scale. Bayesian estimates of ability will differ in a substantive
way depending on the scale chosen for measuring ability. This problem does
not arise in maximum likelihood estimation.
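The scale dependence can be seen in a small numerical sketch. It assumes a three-parameter logistic model with D = 1.7, a standard normal prior, and the number-right true score (the sum of the item response functions) as the alternative scale; the five easy items and the all-correct response pattern are hypothetical.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def true_score(theta, items):
    """Expected number-right score at ability theta."""
    return sum(p3pl(theta, a, b, c) for a, b, c in items)

def posterior_weights(items, u, grid):
    """Unnormalized posterior on a theta grid under a standard normal prior."""
    weights = []
    for t in grid:
        like = 1.0
        for (a, b, c), u_i in zip(items, u):
            p = p3pl(t, a, b, c)
            like *= p if u_i == 1 else 1.0 - p
        weights.append(like * math.exp(-0.5 * t * t))
    return weights

grid = [-5.0 + 0.01 * k for k in range(1001)]
items = [(1.0, -1.0, 0.2)] * 5  # five easy hypothetical items
u = [1, 1, 1, 1, 1]

w = posterior_weights(items, u, grid)
total = sum(w)
mean_theta = sum(t * wt for t, wt in zip(grid, w)) / total
mean_true_score = sum(true_score(t, items) * wt for t, wt in zip(grid, w)) / total

# Posterior mean of theta, carried over to the true-score scale.
print(true_score(mean_theta, items))
# Posterior mean computed directly on the true-score scale.
print(mean_true_score)
```

The two printed values differ because the true score is a nonlinear function of ability, so the estimate that minimizes MSE on one scale is not the transform of the estimate that minimizes it on the other. A maximum likelihood estimate, by contrast, carries over directly under such a transformation.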

Although minimizing MSE on the $\theta$ scale seems appropriate to many people,
the writer believes it is inappropriate. Large differences in $\theta$'s at the
extremes of the ability scale are of very much less importance for most
practical purposes than smaller differences in the middle of the scale. If
the extremes of the scale were important to us, we would be putting more easy
items or more hard items in our tests. An average of squared differences,
averaged over all parts of the scale, is thus not of real interest. A
procedure that attempts to minimize such an average will devote most effort to
minimizing the large squared differences found at the extremes of the scale,
thus partially neglecting more important differences near the middle of the
scale.
In the case of item-parameter estimation, the idea of minimizing the MSE
of the item parameter estimates seems inappropriate for a different reason.
If the item parameter estimates are to be used for equating, for example, the
appropriate quantity to minimize is the squared error in the final equating
tables, not the MSE of the item parameters. If the items are to be used for
subsequent adaptive testing, the appropriate criterion is a mean squared
error of the resulting examinee scores on the adaptive test.

A thought-provoking circumstance is the following: Suppose the true prior 
distribution of each item parameter and ability parameter were known. Given 
repeated testings over a few years, we can actually come close to this. The 
leading Bayesian IRT practitioners prefer not to use such a 'tight' prior; they
prefer to use a more diffuse prior that produces less regression of the
estimates towards the mean. This attitude derives from practical
considerations rather than from Bayesian logic.

Use of Bayesian priors, even diffuse priors, has several practical
advantages that are widely appreciated:

1. Ability estimates ($\hat\theta$) on the $\theta$ scale are automatically restricted
to a reasonable range. Infinite estimates do not occur.

2. Item discrimination parameter estimates never try to become infinite.

3. Estimated lower asymptotes do not come out at implausible values,
even in the case of very easy items that provide no relevant data
for estimating the asymptotes.

The last two advantages convince the writer that Bayesian priors should
probably be used for the discrimination and the lower-asymptote parameters.
Regression towards the mean in estimates of these parameters has less serious
implications than in the case of the ability and difficulty parameters.
When ability parameter estimates are regressed towards the mean, an
examinee's score ($\hat\theta$ or some transformation of $\hat\theta$) depends not only on the
examinee's test performance, but also on the nature of the entire group in
which he or she happens to be included. If the group as a whole is a low-
ability group, the examinee's score may be regressed downwards; if it is a
high-ability group, the examinee's score may be regressed upwards. If the
group is heterogeneous, the regression effect may be small; if the group is
homogeneous, the regression effect could be large. If the test is long and
reliable, the regression of scores may be relatively small; if the test is
short and unreliable, the regression effect could be of serious concern.
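How group heterogeneity and test length change the size of the regression effect can be sketched as follows. The example assumes a three-parameter logistic model with D = 1.7, a normal prior centered at zero whose standard deviation represents the spread of the group, and the posterior mean as the ability estimate; the items and responses are hypothetical, and the long test is simply four copies of the short one.

```python
import math

D = 1.7

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def posterior_mean(items, u, prior_sd, prior_mean=0.0):
    """Posterior mean of theta on a grid, under a normal prior for the group."""
    grid = [-6.0 + 0.01 * k for k in range(1201)]
    weights = []
    for t in grid:
        log_post = -0.5 * ((t - prior_mean) / prior_sd) ** 2
        for (a, b, c), u_i in zip(items, u):
            p = p3pl(t, a, b, c)
            log_post += math.log(p) if u_i == 1 else math.log(1.0 - p)
        weights.append(math.exp(log_post))
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Hypothetical short test: four of five items answered correctly.
base_items = [(1.0, b, 0.2) for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]
base_u = [1, 1, 1, 1, 0]

homogeneous = posterior_mean(base_items, base_u, prior_sd=0.5)
heterogeneous = posterior_mean(base_items, base_u, prior_sd=2.0)
long_test = posterior_mean(base_items * 4, base_u * 4, prior_sd=0.5)
print(homogeneous, heterogeneous, long_test)
```

With the same proportion correct, the estimate is pulled closest to the group mean for the short test in the homogeneous group, and less so when either the group is more spread out or the test is longer.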

We need more practical experience in dealing with these problems in real 
situations. If our work deals with a single test and a single group of 
examinees, regressed ability estimates may pose no problem, because the rank 
order of the examinees' scores will be little affected. If our work deals with 
comparisons of individuals across groups and across tests, with data analyses 
made at different times, we may want more experience before we decide exactly 
how to obtain acceptable results for large-scale testing programs. Bayesian 
methods may be the ultimate recourse, but we need considerable experience with 
them before we can be sure how to use them safely. 